Take-home Exercise 3

Published

March 18, 2023

Modified

March 26, 2023

1 Introduction

Around 77.9% of dwellings in Singapore are made up of Housing and Development Board (HDB) flats, public housing many residents are familiar with and take pride in.

Though the number of households living in HDB apartments continues to increase, there is a steady fall in number of Singaporeans actually living in these flats. One argument provided is that more Singaporeans are selling their flats and “upgrading” to private housings, like condominium and private landed properties. Among those upgrading, selling their current house to afford for a better HDB apartments is not uncommon too.

On the flip side, there is the demand and there are many reasons for residents to buy resale flats.

  1. Families may not be keen to wait approximately 2 to 5.9 years for an apartment under the Build-to-Order exercise.
  2. Couples, being both Permanent Residents, can only access the resale market, if they want to buy a HDB flat.
  3. Single citizens similarly are only allowed to purchase resale flats.

In spite of the ever increasing resale prices, increasing 8.7% YoY, and much more expensive than a BTO flat, resale flats continue to be in the cards for residents to buy their dream homes.

With how expensive these apartments are, you better bet that people will be choosing where to place their money more carefully.

2 Import Packages

Below the packages that we will be using in this exercise:

pacman::p_load(sf, tmap, sfdep, tidyverse, olsrr, ggpubr, GWmodel, SpatialML, tidymodels, jsonlite, readxl, Rfast, corrplot, gtsummary, spdep, Metrics) 

3 Import Data

Dataset Source
HDB Resale Flat Prices Data.gov.sg
Masterplan 2019 Subzone Boundary Professor Kam Tin Seong
Hawker Centres OpenMapSG
Elderly Centres OpenMapSG
Parks OpenMapSG
Kindergartens OpenMapSG
Childcare Centres OpenMapSG
Bus Stops LTA DataMall
MRT Stations LTA DataMall
Primary Schools OpenMapSG
Supermarkets OpenMapSG
Malls Wikipedia
HDB MUP/HIP Status Housing Development Board

3.1 Import HDB Resale Flat Prices

resale_prices <- read_csv("data/aspatial/resale-flat-prices-based-on-registration-date-from-jan-2017-onwards.csv")
head(resale_prices, 5)
# A tibble: 5 × 11
  month   town     flat_…¹ block stree…² store…³ floor…⁴ flat_…⁵ lease…⁶ remai…⁷
  <chr>   <chr>    <chr>   <chr> <chr>   <chr>     <dbl> <chr>     <dbl> <chr>  
1 2017-01 ANG MO … 2 ROOM  406   ANG MO… 10 TO …      44 Improv…    1979 61 yea…
2 2017-01 ANG MO … 3 ROOM  108   ANG MO… 01 TO …      67 New Ge…    1978 60 yea…
3 2017-01 ANG MO … 3 ROOM  602   ANG MO… 01 TO …      67 New Ge…    1980 62 yea…
4 2017-01 ANG MO … 3 ROOM  465   ANG MO… 04 TO …      68 New Ge…    1980 62 yea…
5 2017-01 ANG MO … 3 ROOM  601   ANG MO… 01 TO …      67 New Ge…    1980 62 yea…
# … with 1 more variable: resale_price <dbl>, and abbreviated variable names
#   ¹​flat_type, ²​street_name, ³​storey_range, ⁴​floor_area_sqm, ⁵​flat_model,
#   ⁶​lease_commence_date, ⁷​remaining_lease

3.1.0.1 Convert “Month” to DataTime Format

As we want to have the abilities to filter the data according to its dates, it’s best for us to convert the date column to a DateTime format.

resale_prices$date <- zoo::as.Date(zoo::as.yearmon(resale_prices$month))
head(resale_prices$date, 5)
[1] "2017-01-01" "2017-01-01" "2017-01-01" "2017-01-01" "2017-01-01"
3.1.0.1.1 Data - Filter Jan 2021 to Feb 2023

Here, we will filter the data so we will only be working with those from January 2021 to February 2023.

resale_prices_total <- resale_prices %>% filter(date >= as.Date('2021-01-01')) %>% filter(date <= as.Date('2023-02-01'))
head(resale_prices_total, 5)
# A tibble: 5 × 12
  month   town     flat_…¹ block stree…² store…³ floor…⁴ flat_…⁵ lease…⁶ remai…⁷
  <chr>   <chr>    <chr>   <chr> <chr>   <chr>     <dbl> <chr>     <dbl> <chr>  
1 2021-01 ANG MO … 2 ROOM  170   ANG MO… 01 TO …      45 Improv…    1986 64 yea…
2 2021-01 ANG MO … 2 ROOM  170   ANG MO… 07 TO …      45 Improv…    1986 64 yea…
3 2021-01 ANG MO … 3 ROOM  331   ANG MO… 04 TO …      68 New Ge…    1981 59 yea…
4 2021-01 ANG MO … 3 ROOM  534   ANG MO… 04 TO …      68 New Ge…    1980 58 yea…
5 2021-01 ANG MO … 3 ROOM  561   ANG MO… 01 TO …      68 New Ge…    1980 58 yea…
# … with 2 more variables: resale_price <dbl>, date <date>, and abbreviated
#   variable names ¹​flat_type, ²​street_name, ³​storey_range, ⁴​floor_area_sqm,
#   ⁵​flat_model, ⁶​lease_commence_date, ⁷​remaining_lease

3.2 Import Subzone Data

subzone_sf <- st_read(dsn="data/geospatial/MPSZ-2019", layer="MPSZ-2019") %>% st_transform(crs = 3414)
Reading layer `MPSZ-2019' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\MPSZ-2019' 
  using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS:  WGS 84
head(subzone_sf)
Simple feature collection with 6 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 8012.578 ymin: 22108.68 xmax: 35287.9 ymax: 31092.38
Projected CRS: SVY21 / Singapore TM
                SUBZONE_N SUBZONE_C      PLN_AREA_N PLN_AREA_C       REGION_N
1             MARINA EAST    MESZ01     MARINA EAST         ME CENTRAL REGION
2        INSTITUTION HILL    RVSZ05    RIVER VALLEY         RV CENTRAL REGION
3          ROBERTSON QUAY    SRSZ01 SINGAPORE RIVER         SR CENTRAL REGION
4 JURONG ISLAND AND BUKOM    WISZ01 WESTERN ISLANDS         WI    WEST REGION
5            FORT CANNING    MUSZ02          MUSEUM         MU CENTRAL REGION
6        MARINA EAST (MP)    MPSZ05   MARINE PARADE         MP CENTRAL REGION
  REGION_C                       geometry
1       CR MULTIPOLYGON (((33222.98 29...
2       CR MULTIPOLYGON (((28481.45 30...
3       CR MULTIPOLYGON (((28087.34 30...
4       WR MULTIPOLYGON (((14557.7 304...
5       CR MULTIPOLYGON (((29542.53 31...
6       CR MULTIPOLYGON (((35279.55 30...

3.3 Import Independent Variables

With any creation of a predictive model, comes the independent variables that the predictions will rely on.

3.3.1 Geospatial Datasets

Since this project is pertain to Singapore, OpenMapSG will be helpful for us to get these geospatial datasets with ease.

As you can see below, we will be using a package called onemapsgapi that wraps OpenMapSG API.

Using it is relatively simple:

  1. One will need to register an account with OpenMapSG
  2. Get a token using get_token()
  3. Search for datasets they want in search_themes()
  4. Optionally, look up on the status on desired dataset through get_theme_status()
  5. And finally get your data from get_theme()!
library(onemapsgapi)
token <- get_token("Your Email", "Your Password")
search_themes(token, "searchval")
get_theme_status(token, "themename")
themetibble <- get_theme(token, "themename")
Note

I will have already performed the above steps to acquire my datasets. In our case, I will directly read the geospatial data I have downloaded using st_read().

hawkercentre_sf <- st_read(dsn="data/geospatial/hawkercentre", layer="hawkercentre") %>% st_transform(crs = 3414)
Reading layer `hawkercentre' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\hawkercentre' 
  using driver `ESRI Shapefile'
Simple feature collection with 125 features and 18 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.6974 ymin: 1.272716 xmax: 103.9882 ymax: 1.449017
Geodetic CRS:  WGS 84
eldercare_sf <- st_read(dsn="data/geospatial/eldercare", layer="eldercare") %>% st_transform(crs = 3414)
Reading layer `eldercare' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\eldercare' 
  using driver `ESRI Shapefile'
Simple feature collection with 133 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.7119 ymin: 1.271472 xmax: 103.9561 ymax: 1.439561
Geodetic CRS:  WGS 84
parks_sf <- st_read(dsn="data/geospatial/nationalparks", layer="nationalparks") %>% st_transform(crs = 3414)
Reading layer `nationalparks' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\nationalparks' 
  using driver `ESRI Shapefile'
Simple feature collection with 421 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.6929 ymin: 1.214491 xmax: 104.0538 ymax: 1.462094
Geodetic CRS:  WGS 84
kindergartens_sf <- st_read(dsn="data/geospatial/kindergartens", layer="kindergartens") %>% st_transform(crs = 3414)
Reading layer `kindergartens' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\kindergartens' 
  using driver `ESRI Shapefile'
Simple feature collection with 448 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.6887 ymin: 1.247759 xmax: 103.9717 ymax: 1.455452
Geodetic CRS:  WGS 84
childcare_sf <- st_read(dsn="data/geospatial/childcare", layer="childcare") %>% st_transform(crs = 3414)
Reading layer `childcare' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\childcare' 
  using driver `ESRI Shapefile'
Simple feature collection with 1925 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.6878 ymin: 1.247759 xmax: 103.9897 ymax: 1.462134
Geodetic CRS:  WGS 84
busstop_sf <- st_read(dsn="data/geospatial/BusStop_Feb2023", layer="BusStop") %>% st_transform(crs = 3414)
Reading layer `BusStop' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\BusStop_Feb2023' 
  using driver `ESRI Shapefile'
Simple feature collection with 5159 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
mrt_sf <- st_read(dsn="data/geospatial/TrainStation_Feb2023", layer="RapidTransitSystemStation") %>% st_transform(crs = 3414)
Reading layer `RapidTransitSystemStation' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\TrainStation_Feb2023' 
  using driver `ESRI Shapefile'
Simple feature collection with 220 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 6068.209 ymin: 27478.44 xmax: 45377.5 ymax: 47913.58
Projected CRS: SVY21
supermarkets_sf <- st_read(dsn="data/geospatial/supermarkets", layer="SUPERMARKETS") %>% st_transform(crs = 3414)
Reading layer `SUPERMARKETS' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\supermarkets' 
  using driver `ESRI Shapefile'
Simple feature collection with 526 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 4901.188 ymin: 25529.08 xmax: 46948.22 ymax: 49233.6
Projected CRS: SVY21

3.3.2 Aspatial Data

primarySch <- fromJSON("data/geospatial/primaryschools/primaryschools.json")[["SearchResults"]][-1,c(2:5, 7, 13:14)]
primarySch_sf <- st_as_sf(primarySch, coords=c("LONGITUDE", "LATITUDE"), crs=4326) %>% st_transform(crs = 3414)
malls <- read_xlsx("data/aspatial/malls-20230320.xlsx")
hdb_upgrades <- read_xlsx("data/aspatial/HDB_HIP-MUP-20230312.xlsx")

4 Data Wrangling

4.1 Central Business District

cbd_sf <- subzone_sf %>% filter(subzone_sf$PLN_AREA_N == "DOWNTOWN CORE")

4.2 Primary Schools

4.2.1 Scraping of Ranking Data

To make full use of primary schools as an independent variable, we need to merge the ranking we get from Schlah’s Primary School Rankings. It provides substantial data in how it derived its ranking, which offers a more objective rank.

We then scraped this data from the website.

primarySch_ranking <- fromJSON("data/geospatial/primaryschools/primaryschools_rankings.json")[["props"]][["pageProps"]][["sortedSchools"]]
head(primarySch_ranking, 5)
  rank total     schoolInfo.schoolSlug schoolInfo.schoolCode
1    1   186    nanyang-primary-school                  5258
2    2   186            tao-nan-school                  5240
3    3   186      catholic-high-school                  7102
4    4   186    nan-hua-primary-school                  5622
5    5   186 st-hilda-s-primary-school                  5025
       schoolInfo.schoolName schoolInfo.schoolNameZh
1     Nanyang Primary School                南洋小学
2             Tao Nan School                        
3       Catholic High School                公教中学
4     Nan Hua Primary School             南 华 小 学
5 St. Hilda's Primary School            圣希尔达小学
                                      schoolInfo.schoolLogoUrl
1   https://beta.moe.gov.sg/uploads/nanyang-primary-school.jpg
2           https://beta.moe.gov.sg/uploads/tao-nan-school.jpg
3     https://beta.moe.gov.sg/uploads/catholic-high-school.jpg
4   https://beta.moe.gov.sg/uploads/nan-hua-primary-school.jpg
5 https://beta.moe.gov.sg/uploads/st-hildas-primary-school.jpg
  schoolInfo.schoolLevel schoolInfo.schoolZone   schoolInfo.schoolType
1         Primary School                  West Government-Aided School
2         Primary School                  East Government-Aided School
3         Primary School                 North Government-Aided School
4         Primary School                  West Government-Aided School
5         Primary School                  East Government-Aided School
  schoolInfo.schoolNature schoolInfo.schoolSession
1            Co-Ed School           Single Session
2            Co-Ed School           Single Session
3            Boys' School           Single Session
4            Co-Ed School           Single Session
5            Co-Ed School           Single Session
                   schoolInfo.website schoolInfo.address schoolInfo.schoolArea
1         https://www.nyps.moe.edu.sg     52 King's Road           Bukit Timah
2       https://www.taonan.moe.edu.sg 49 Marine Crescent         Marine Parade
3 https://www.catholichigh.moe.edu.sg 9 Bishan Street 22                Bishan
4    https://www.nanhuapri.moe.edu.sg   30 Jalan Lempeng              Clementi
5         https://www.shps.moe.edu.sg   2 Tampines Ave 3              Tampines
  schoolInfo.postalCode schoolInfo.tel      schoolInfo.email
1                268097       64672677       nyps@moe.edu.sg
2                449761       64428307 taonan_sch@moe.edu.sg
3                579767       64589869        chs@moe.edu.sg
4                128806       67788050       nhps@moe.edu.sg
5                529706       67811916       shps@moe.edu.sg
  schoolInfo.latitude schoolInfo.longitude
1            1.320847             103.8078
2            1.305285             103.9116
3            1.354723             103.8449
4            1.306116             103.7651
5            1.349350             103.9379
                                                                                                                                                                                                                                                                                       schoolInfo.nearestSlugs
1 raffles-girls-primary-school, singapore-chinese-girls-primary-school, new-town-primary-school, henry-park-primary-school, queenstown-primary-school, methodist-girls-school-primary, anglo-chinese-school-primary, fairfield-methodist-school-primary, alexandra-primary-school, anglo-chinese-school-junior
2                                                                    chij-katong-primary, ngee-ann-primary-school, haig-girls-school, st-stephen-s-school, opera-estate-primary-school, eunos-primary-school, kong-hwa-school, maha-bodhi-school, telok-kurau-primary-school, geylang-methodist-school-primary
3                                   guangyang-primary-school, townsville-primary-school, kuo-chuan-presbyterian-primary-school, teck-ghee-primary-school, ai-tong-school, marymount-convent-school, ang-mo-kio-primary-school, kheng-cheng-school, first-toa-payoh-primary-school, st-gabriel-s-primary-school
4                        clementi-primary-school, pei-tong-primary-school, qifa-primary-school, fairfield-methodist-school-primary, henry-park-primary-school, bukit-timah-primary-school, methodist-girls-school-primary, pei-hwa-presbyterian-primary-school, new-town-primary-school, keming-primary-school
5                                              junyuan-primary-school, tampines-primary-school, poi-ching-school, yumin-primary-school, chongzheng-primary-school, gongshang-primary-school, angsana-primary-school, st-anthony-s-canossian-primary-school, tampines-north-primary-school, red-swastika-school
                                          schoolInfo.schoolStatus
1 Special Assistance Plan (SAP), Gifted Education Programme (GEP)
2 Special Assistance Plan (SAP), Gifted Education Programme (GEP)
3 Special Assistance Plan (SAP), Gifted Education Programme (GEP)
4 Special Assistance Plan (SAP), Gifted Education Programme (GEP)
5                                Gifted Education Programme (GEP)
                              schoolInfo.specialNeeds
1 Allied educators (Learning and behavioural support)
2 Allied educators (Learning and behavioural support)
3 Allied educators (Learning and behavioural support)
4 Allied educators (Learning and behavioural support)
5 Allied educators (Learning and behavioural support)
                                                                                                                                                                                                                                                                                      schoolInfo.schoolMission
1                                                                                                Developing our pupils to reach their fullest potential within a bicultural environment that is steeped in diligence, prudence, respectability and simplicity, thereby enabling them to contribute to society.
2                                                                                                                                                                                                                                To nurture innovative pupils of exemplary character with a love for learning.
3                                                                  To establish Catholic High School as a school of distinction in innovative and challenging programmes, a forerunner in character building and a beacon for the mindset of excellence, firmly built upon the foundation of Christian values.
4                                                                                                                                                                                                                                             Nurturing Gracious Citizens rooted in Values and Chinese Culture
5 As an Anglican school, we are committed to providing a balanced development of body, mind and spirit for our students and nurturing God-fearing citizens for our nation.Our school values are:  Serve Humbly, Love Sincerely, Learn Continuously, Lead Wisely, Live ResponsiblyOur School motto:  Go Forward
                                                                                                                                                       schoolInfo.schoolVision
1                                                                                                                                     Learners of Character, Leaders in Action
2                                                                                                                                              Love to Learn and Learn to Love
3 The Catholic High student is a leader, gentleman and bilingual scholar of high integrity and robust character, who is passionate about life, learning and service to others.
4                                                                                                  A School of Engaged Learners who Lead with Character and Serve with a Heart
5                                                                          One Hildan Family anchored on Godly values - Nurturing Servant Leaders and Changemakers of Tomorrow
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              schoolInfo.schoolEchos
1 Nanyang Primary School (NYPS) was established in 1917 as the primary section of The Singapore Nanyang Girls' School. NYPS was separated from the secondary section in 1978 due to increasing pupil enrolment and has functioned as a single-session school since 2004.  It was accorded the Special Assistance Plan (SAP) status in 1984.  In 1990, it began to offer MOE's Gifted Education Programme (GEP)., The school strives to provide an education which prepares the students beyond academic excellence. With the provision of a bi-cultural learning environment, students are nurtured to be learners of character and leaders who will create a positive impact on their communities and society., The Nanyang Curriculum adopts the Head, Heart and Hands approach for the holistic development of the students, as the school believes that internalisation, reflection and putting into action values and skills, are key to learning. The key leverage for the holistic development of students in the Nanyang Curriculum is differentiation which caters to students' different learning needs. Building on its rich Chinese traditions and values while equipping our students with 21st century skills and competencies underscores curriculum innovation in NYPS., An extensive culture for learning and innovation permeates the school as evident in the whole school embarking on the Professional Learning Community (PLC) journey, introducing Lesson Study (LS) as an additional tool to Action Research (AR) and Learning Circle (LC) for teachers to implement curriculum innovations that will deepen their pedagogical knowledge., The school also believes in establishing strong partnership for the holistic development of students.  The key partners comprising the School Management Committee, Nanyang Schools Alumni Association and Parent Teacher Association work collaboratively to preserve the NYPS spirit and identity, reinforcing the values-based culture via role-modelling and their contributions and support.
2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               Guided by our mission to nurture innovative pupils of exemplary character with a love for learning, we are committed to nurturing and developing each child to be the best that he or she can be. We are committed to pushing the boundaries of excellence in curriculum innovation to provide an all-round, values-based education for our students. As a Special Assistance Plan (SAP) school, we will continue to provide a holistic Chinese culture learning experience for our pupils. With strong support from our alumni, parents, the Singapore Hokkien Huay Kuan and hard work from our pupils and teachers, the school will be able to help every pupil realise his or her full potential.
3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Founded in 1935 by the late Rev. Fr. Edward Becheras, Catholic High School is one of Singapore's highly regarded institutions under the auspices of the Catholic Archdiocese. As a Special Assistance Plan (SAP) school, CHS is grounded in the philosophy of bilingualism and biculturalism. As a Catholic mission school, love is the motivation behind all our actions. We ensure there is Joy of Learning in school life, and that students develop a Mindset of Excellence. The school's core values underpin every member's attitude, actions and aspirations. Our heritage, alumni and parent support group are the strengths that our school draws upon for our model of education, where every student is a Leader, Gentleman and Bilingual Scholar in the making. Offering the unique dual-track model with both the O-Level and Integrated Programmes (IP), the CHS experience offers a curriculum strong in academic distinction, leadership and character development, and sports and aesthetics excellence. Our IP students will spend their first 4 years in CHS before progressing to Eunoia Junior College to complete the IP.
4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          Nan Hua Primary school is a government-aided school which was started in 1917.  As, a Special Assistance Plan (SAP) school, our students learn both the English and Chinese as, first languages. We are also one of the nine Gifted Education Programme (GEP) Centres in, Singapore., Guided by the school's vision and mission and with the school's motto (Loyalty, Filial Piety, Humanity, Love, Courtesy, Righteousness, Integrity and Sense of, Shame) as the foundational values, we aim to help every Nan Hua child develop into, gracious citizen rooted in values and Chinese culture who is also a caring contributor of the, society; one who leads with character and serves with a heart., Our success story strongly builds on the basis that it takes 'a village to raise a child'. We, firmly believe is collaborating closely with our key partners such as the members of, our School Management Committee, Alumni Association, Parent Support Group,, various community partners and parents to help our children in their development., Together with our dedicated teaching team and committed administrative and support, staff, we seek to give our best to nurture each child holistically and provide them with a, student-centric, values-driven education., Nan Hua Primary School distinguishes herself to be a school with rich traditions, a culture of excellence and one that prepares our children for the 21st century.
5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            St Hilda's Primary School established in 1934 is an Anglican Christian school with boys and girls from multi-ethnic backgrounds   Through generations of committed school leaders, staff, church workers and parents, the school has developed a culture of excellence and care.  One of the key school strengths is the emphasis on moral values and character development.  We have daily devotions and a strong Character and Leadership Development programme.  The school challenges our students to do well not only academically but also in their co-curricular activities.  Every Hildan through his or her six years of education will grow to be an Articulate, Confident, Innovative and Caring Ambassador.
  schoolInfo.motherTongues                  schoolInfo.affiliatedSchools
1                  Chinese                     nanyang-girls-high-school
2                  Chinese                                              
3                  Chinese catholic-junior-college, catholic-high-school
4                  Chinese                                              
5    Chinese, Malay, Tamil                   st-hilda-s-secondary-school
  admissionRank sportsRank artsRank uniformedGroupsRank
1            11          1        2                   3
2            19          7       18                   7
3             8         15        5                  40
4             1         77       47                 124
5             9         39       28                  47

4.2.2 Wrangle Dataset on Primary Schools’ Rankings

You may realise from above that primarySch_ranking that the there are quite a few columns prefixed with “schoolInfo$”, which mean that these columns are actually under the column schoolInfo in primarySch_ranking. Using unnest(), the columns inside schoolInfo will be expanded so that we can more easily access schools’ informations like names, addresses.

primarySch_ranking = unnest(primarySch_ranking, schoolInfo)[,c(1, 5, 9, 14:16, 22, 28:32)] 
head(primarySch_ranking)
# A tibble: 6 × 12
   rank school…¹ schoo…² address schoo…³ posta…⁴ schoo…⁵ affil…⁶ admis…⁷ sport…⁸
  <int> <chr>    <chr>   <chr>   <chr>   <chr>   <list>  <list>    <int>   <int>
1     1 Nanyang… West    52 Kin… Bukit … 268097  <chr>   <chr>        11       1
2     2 Tao Nan… East    49 Mar… Marine… 449761  <chr>   <chr>        19       7
3     3 Catholi… North   9 Bish… Bishan  579767  <chr>   <chr>         8      15
4     4 Nan Hua… West    30 Jal… Clemen… 128806  <chr>   <chr>         1      77
5     5 St. Hil… East    2 Tamp… Tampin… 529706  <chr>   <chr>         9      39
6     6 Henry P… West    1 Holl… Bukit … 278790  <chr>   <chr>        12       9
# … with 2 more variables: artsRank <int>, uniformedGroupsRank <int>, and
#   abbreviated variable names ¹​schoolName, ²​schoolZone, ³​schoolArea,
#   ⁴​postalCode, ⁵​schoolStatus, ⁶​affiliatedSchools, ⁷​admissionRank, ⁸​sportsRank

4.2.3 Merging Geospatial and Aspatial Data of Primary Schools

Using postal codes, we will merge them together!

primarySch_ranking_sf = merge(primarySch_ranking, primarySch_sf, by.x='postalCode', by.y="SCH_POSTAL_CODE")
head(primarySch_ranking_sf, 5)
  postalCode rank                  schoolName schoolZone
1     088256  182   Cantonment Primary School      South
2     099757  101              Chij (Kellock)      South
3     099840   27    Radin Mas Primary School      South
4     109100  168 Blangah Rise Primary School      South
5     128104   33         Qifa Primary School       West
                   address  schoolArea schoolStatus         affiliatedSchools
1       1 Cantonment Close Bukit Merah                                       
2      1 Bukit Teresa Road Bukit Merah              chij-st-theresa-s-convent
3    1 Bukit Purmei Avenue Bukit Merah                                       
4 91 Telok Blangah Heights Bukit Merah                                       
5     50 West Coast Avenue    Clementi                                       
  admissionRank sportsRank artsRank uniformedGroupsRank
1           132        153      163                 153
2           104         29       82                 163
3            10         74       83                  13
4           132        167      173                  61
5            61         26       10                 106
                   SCHOOLNAME SCH_HSE_BLK_NUM HSE_BLK_NUM         SCH_ROAD_NAME
1   CANTONMENT PRIMARY SCHOOL               1           1      CANTONMENT CLOSE
2              CHIJ (KELLOCK)               1           1     BUKIT TERESA ROAD
3    RADIN MAS PRIMARY SCHOOL               1           1   BUKIT PURMEI AVENUE
4 BLANGAH RISE PRIMARY SCHOOL              91          91 TELOK BLANGAH HEIGHTS
5         QIFA PRIMARY SCHOOL              50          50     WEST COAST AVENUE
                   geometry
1  POINT (28735.3 28695.24)
2 POINT (27413.63 28588.52)
3 POINT (26951.05 28619.79)
4 POINT (25215.83 28728.96)
5  POINT (19496.7 32798.72)

But sadly the merged sf has 180 rows while the original geospatial dataset has 181 rows, meaning there is a school with mismatch data.

4.2.3.0.1 Finding Mismatched Row

We will first order primarySch_ranking_sf according to their ranking.

primarySch__order_ranking_sf = primarySch_ranking_sf[order(primarySch_ranking_sf$rank),]

We then loop through the primarySch_ranking, the original DataFrame, to find the schools not in the merged sf.

Note that there are actually 186 rows in primarySch_ranking, meaning there is a difference of 6 rows.

schools_dont_exist = list()
for (i in 1:nrow(primarySch_ranking)){
  if (i %in% primarySch__order_ranking_sf$rank == FALSE){
    schools_dont_exist <- append(schools_dont_exist, i)
  }
}

schools <- data.frame()

for (sch in schools_dont_exist){
  schools = rbind(schools, primarySch_ranking[primarySch_ranking$rank == sch,])
}
schools
# A tibble: 6 × 12
   rank school…¹ schoo…² address schoo…³ posta…⁴ schoo…⁵ affil…⁶ admis…⁷ sport…⁸
  <int> <chr>    <chr>   <chr>   <chr>   <chr>   <list>  <list>    <int>   <int>
1   116 Pioneer… West    23 Jur… Jurong… 649076  <chr>   <chr>       132     103
2   119 Stamfor… South   1 Vict… Central 198423  <chr>   <chr>       132      42
3   134 Eunos P… East    95 Jal… Geylang 419529  <chr>   <chr>       132      88
4   137 Guangya… South   6 Bish… Bishan  579806  <chr>   <chr>       132     145
5   165 Juying … West    31 Jur… Jurong… 649037  <chr>   <chr>       132     116
6   171 Angsana… East    3 Tamp… Tampin… 529366  <chr>   <chr>       132     150
# … with 2 more variables: artsRank <int>, uniformedGroupsRank <int>, and
#   abbreviated variable names ¹​schoolName, ²​schoolZone, ³​schoolArea,
#   ⁴​postalCode, ⁵​schoolStatus, ⁶​affiliatedSchools, ⁷​admissionRank, ⁸​sportsRank

So… we need to look through all the current situations of all 6 schools and determine, which one is the one in OneMapSG’s geospatial data.

  • Pioneer Primary School merged in Juying Primary School in 2021. In addition, the merged school is moving to a new location in 2026, hence not opening from its Primary 1 Registration from 2021 to 2024
  • Stamford Primary School has merged with Farrer Park Primary School in 2023.
  • Eunos Primary School has merged with Telok Kurau Primary School.
  • Guangyang Primary School has merged with Townsville Primary School.

With that, Angsana Primary School is the row that has its postal code different.

Checking the school’s website, Angsana Primary School’s postal code is actually 528565. Hence we need to change primarySch_ranking’s postal code to facilitate merging.

primarySch_ranking[primarySch_ranking$schoolName == "Angsana Primary School","postalCode"] = '528565'

primarySch_ranking_df = merge(primarySch_ranking, primarySch_sf, by.x='postalCode', by.y="SCH_POSTAL_CODE")
nrow(primarySch_ranking_df)
[1] 181
primarySch_ranking_sf <- st_as_sf(primarySch_ranking_df) %>% st_transform(crs = 3414)

4.3 Masterplan Subzone 2019

Looking at the subzone dataset, it seems that there are invalid geometries in it.

length(which(st_is_valid(subzone_sf) == FALSE))
[1] 6

Thus, we will be using st_make_valid() to correct them.

subzone_sf <- st_make_valid(subzone_sf)
length(which(st_is_valid(subzone_sf) == FALSE))
[1] 0

4.4 MRT

Similarly for MRT, there are invalid geometries as well.

length(which(st_is_valid(mrt_sf) == FALSE))
[1] 2
mrt_sf[st_is_valid(mrt_sf) == FALSE,]
Simple feature collection with 3 features and 4 fields (with 1 geometry empty)
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 26568.45 ymin: 27478.44 xmax: 28080.89 ymax: 37543.25
Projected CRS: SVY21 / Singapore TM
    TYP_CD STN_NAM TYP_CD_DES                STN_NAM_DE
159      0    <NA>        MRT  HARBOURFRONT MRT STATION
NA      NA    <NA>       <NA>                      <NA>
218      0    <NA>        MRT UPPER THOMSON MRT STATION
                          geometry
159 POLYGON ((26736.44 27495.44...
NA                   POLYGON EMPTY
218 POLYGON ((27808.12 37518.2,...

However, some geometries have less than 4 polygons, which st_make_valid() does not resolve. Thus we will be excluding them from our exercise.

mrt_sf <- mrt_sf[st_is_valid(mrt_sf) == TRUE,]
mrt_sf <- mrt_sf[!st_is_empty(mrt_sf),,drop=FALSE]
length(which(st_is_valid(mrt_sf) == FALSE))
[1] 0

4.5 HDB

The sf DataFrame we are acquired from Data.gov.sg does not have any geospatial data. But it has the block and streetname that we can derived the data from, through OpenMapSG API.

library(httr)
geocode <- function(block, streetname="") {
  base_url <- "https://developers.onemap.sg/commonapi/search"
  address <- paste(block, streetname, sep = " ")
  query <- list("searchVal" = address, 
                "returnGeom" = "Y",
                "getAddrDetails" = "N",
                "pageNum" = "1")
  
  res <- GET(base_url, query = query)
  restext<-content(res, as="text")
  
  output <- fromJSON(restext)  %>% 
    as.data.frame %>%
    select(results.LATITUDE, results.LONGITUDE)

  return(output)
}
resale_prices_total$LATITUDE <- 0
resale_prices_total$LONGITUDE <- 0

for (i in 1:nrow(resale_prices_total)){
  temp_output <- geocode(resale_prices_total[i, 4], resale_prices_total[i, 5])
  
  resale_prices_total$LATITUDE[i] <- temp_output$results.LATITUDE
  resale_prices_total$LONGITUDE[i] <- temp_output$results.LONGITUDE
}
resale_prices_sf <- st_as_sf(resale_prices_total, coords=c("LONGITUDE", "LATITUDE"), crs=4326)
st_write(resale_prices_sf, "data/geospatial/resale_prices/resale_prices.shp")
resale_prices_sf <- st_read(dsn="data/geospatial/resale_prices", layer="resale_prices") %>% st_transform(crs = 3414)
Reading layer `resale_prices' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\resale_prices' 
  using driver `ESRI Shapefile'
Simple feature collection with 60218 features and 12 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.6442 ymin: 1.27038 xmax: 103.9878 ymax: 1.457071
Geodetic CRS:  WGS 84

4.5.1 Merge with HDB Status

We also have data on the upgrades done on these buildings. It will be good to separate the both two types of upgrades.

Note that MUP is actually then the precessor of HIP before 2007. Differentiating the upgrades then ables us to see if the timing of upgrades has an impact on the resell prices.

4.5.1.1 Home Improvement Programme (HIP)

HIP is described as HDB’s programme to “resolve common maintenance problems of ageing flats”.

hdb_hip <- hdb_upgrades %>% filter(TYPE == "HIP")
colnames(hdb_hip)[3] ="HIP"
resale_prices_upgrade_sf <- merge(resale_prices_sf, hdb_hip, by.x=c("block", "strt_nm"), by.y=c("BLK", "STREET"), all.x = TRUE)

4.5.1.2 Main Upgrading Programme (MUP)

Before 2007, MUP was dedicated to do the same things.

hdb_mup <- hdb_upgrades %>% filter(TYPE == "MUP")
colnames(hdb_mup)[3] ="MUP"
resale_prices_upgrade_sf <- merge(resale_prices_upgrade_sf, hdb_mup, by.x=c("block", "strt_nm"), by.y=c("BLK", "STREET"), all.x = TRUE)

4.6 Malls

Finally, we have the mall’s data from Wikipedia. Similar to the HDB flats, we don’t have geospatial data. But thankfully, we can do the same thing using OneMapSG to get the coordinates.

malls$LATITUDE <- 0
malls$LONGITUDE <- 0

for (i in 1:nrow(malls)){
  temp_output <- geocode(malls[i, 1], )
  
  malls$LATITUDE[i] <- temp_output$results.LATITUDE
  malls$LONGITUDE[i] <- temp_output$results.LONGITUDE
}
malls_sf <- st_as_sf(malls, coords=c("LONGITUDE", "LATITUDE"), crs=4326)
st_write(malls_sf, "data/geospatial/malls/malls.shp")
malls_sf <- st_read(dsn="data/geospatial/malls", layer="malls") %>% st_transform(crs = 3414)
Reading layer `malls' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\malls' 
  using driver `ESRI Shapefile'
Simple feature collection with 167 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 103.6784 ymin: 1.263932 xmax: 103.9895 ymax: 1.448194
Geodetic CRS:  WGS 84

4.7 Top Primary Schools

It is difficult to really determine what is the appropriate number of top schools to filter for, since schools can very much have their own specialties and histories (like alumni communities) that can affect the data, outside of the ones used by the website we scrapped the ranking data from.

Hence top 20 is admittedly derived after I looked through the dataset and judged that the top 20 schools are really known to be some of the best primary schools, which sentiments I believe will be similar among the general Singapore population.

top_primarySch_ranking_sf <- primarySch_ranking_sf %>% filter(rank <= 20)

5 Proximity

Function to acquire proximity

proximity_calculator <- function(original_sf, derived_sf, column_name) {
  dist_matrix <- st_distance(original_sf, derived_sf)
  original_sf[, column_name] <- rowMins(dist_matrix) / 1000
  return(original_sf)
}
resale_proximity_sf <- resale_prices_upgrade_sf 
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = cbd_sf, column_name = "PROX_CBD")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = hawkercentre_sf, column_name="PROX_HAWKER")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = eldercare_sf, column_name = "PROX_ELDERCARE")
proximity_calculator(original_sf = resale_proximity_sf, derived_sf = parks_sf, column_name="PROX_PARK")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = kindergartens_sf, column_name = "PROX_KINDERGARTEN")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = childcare_sf, column_name = "PROX_CHILDCARE")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = busstop_sf, column_name = "PROX_BUSSTOP")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = mrt_sf, column_name = "PROX_MRT")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = primarySch_ranking_sf, column_name = "PROX_SCH")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = supermarkets_sf, column_name="PROX_SUPERMARKET")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = malls_sf, column_name = "PROX_MALL")
resale_proximity_sf <- proximity_calculator(original_sf = resale_proximity_sf, derived_sf = top_primarySch_ranking_sf, column_name = "PROX_TOPSCH")

5.1 Within radius

library(units)
radius_calculator <- function(original_sf, derived_sf, column_name, radius) {
  dist_matrix <- st_distance(original_sf, derived_sf) %>%
    drop_units() %>%
    as.data.frame()
  original_sf[,column_name] <- rowSums(dist_matrix <= radius)
  return(original_sf)
}
resale_proximity_sf <- radius_calculator(resale_proximity_sf, kindergartens_sf, "NUM_KINDERGARTEN", 350)
resale_proximity_sf <- radius_calculator(resale_proximity_sf, childcare_sf, "NUM_CHILDCARE", 350)
resale_proximity_sf <- radius_calculator(resale_proximity_sf, busstop_sf, "NUM_BUSSTOP", 350)
resale_proximity_sf <- radius_calculator(resale_proximity_sf, primarySch_ranking_sf, "NUM_SCH", 1000)
resale_proximity_sf <- radius_calculator(resale_proximity_sf, top_primarySch_ranking_sf, "NUM_TOPSCH", 1000)
st_write(resale_proximity_sf, "data/geospatial/resale_proximity/resale_proximity.shp")
resale_proximity_sf <- st_read(dsn="data/geospatial/resale_proximity", layer="resale_proximity")
Reading layer `resale_proximity' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\resale_proximity' 
  using driver `ESRI Shapefile'
Simple feature collection with 60218 features and 32 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 6958.193 ymin: 28097.64 xmax: 45192.3 ymax: 48741.06
Projected CRS: SVY21 / Singapore TM

6 Get Proximity Locational Factors

6.1 Floor Level

storey_category <- unique(resale_proximity_sf$stry_rn)
storey_category
 [1] "07 TO 09" "16 TO 18" "01 TO 03" "13 TO 15" "10 TO 12" "04 TO 06"
 [7] "19 TO 21" "22 TO 24" "25 TO 27" "34 TO 36" "31 TO 33" "37 TO 39"
[13] "40 TO 42" "28 TO 30" "43 TO 45" "49 TO 51" "46 TO 48"
storey_names <- 1:length(storey_category)
storey_order <- data.frame(storey_category, storey_names)
storey_order
   storey_category storey_names
1         07 TO 09            1
2         16 TO 18            2
3         01 TO 03            3
4         13 TO 15            4
5         10 TO 12            5
6         04 TO 06            6
7         19 TO 21            7
8         22 TO 24            8
9         25 TO 27            9
10        34 TO 36           10
11        31 TO 33           11
12        37 TO 39           12
13        40 TO 42           13
14        28 TO 30           14
15        43 TO 45           15
16        49 TO 51           16
17        46 TO 48           17
resale_level_loc_sf <- merge(resale_proximity_sf, storey_order, by.x = "stry_rn", by.y = "storey_category", all.x=TRUE)
resale_level_loc_sf
Simple feature collection with 60218 features and 33 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 6958.193 ymin: 28097.64 xmax: 45192.3 ymax: 48741.06
Projected CRS: SVY21 / Singapore TM
First 10 features:
    stry_rn block          strt_nm   month            town   flt_typ flr_r_s
1  01 TO 03  117B     JLN TENTERAM 2021-08 KALLANG/WHAMPOA    4 ROOM      93
2  01 TO 03  101A  LOR 2 TOA PAYOH 2021-10       TOA PAYOH EXECUTIVE     145
3  01 TO 03    57  LOR 5 TOA PAYOH 2021-08       TOA PAYOH    3 ROOM      61
4  01 TO 03  305A  ANCHORVALE LINK 2021-06        SENGKANG    5 ROOM     110
5  01 TO 03   237    HOUGANG ST 21 2022-05         HOUGANG EXECUTIVE     148
6  01 TO 03   538 ANG MO KIO AVE 5 2023-02      ANG MO KIO    3 ROOM      68
7  01 TO 03   358   CLEMENTI AVE 2 2021-07        CLEMENTI    4 ROOM      92
8  01 TO 03   461         SEGAR RD 2021-02   BUKIT PANJANG    4 ROOM      93
9  01 TO 03  424D    YISHUN AVE 11 2022-11          YISHUN    3 ROOM      67
10 01 TO 03   401   HOUGANG AVE 10 2021-07         HOUGANG    3 ROOM      68
             flt_mdl ls_cmm_            rmnng_l rsl_prc       date  HIP
1            Model A    2017 95 years 03 months  655000 2021-08-01 <NA>
2          Apartment    1993 70 years 07 months  898000 2021-10-01 <NA>
3           Standard    1973 50 years 06 months  260000 2021-08-01 <NA>
4           Improved    2002 79 years 09 months  445000 2021-06-01 <NA>
5         Maisonette    1984 61 years 03 months  760000 2022-05-01 <NA>
6     New Generation    1980 56 years 02 months  358000 2023-02-01  HIP
7     New Generation    1978 55 years 07 months  422000 2021-07-01  HIP
8  Premium Apartment    2015 93 years 07 months  390000 2021-02-01 <NA>
9            Model A    2015 92 years 03 months  360000 2022-11-01 <NA>
10    New Generation    1985 62 years 11 months  325000 2021-07-01  HIP
       TOWN_x  MUP    TOWN_y PROX_CB PROX_HA PROX_EL PROX_KI PROX_CH PROX_BU
1        <NA> <NA>      <NA>   0.013   0.122   0.115   0.371   0.799   2.842
2        <NA> <NA>      <NA>   0.012   0.114   0.033   0.422   1.905   2.629
3        <NA>  MUP TOA PAYOH   0.012   0.115   0.033   0.423   0.902   0.934
4        <NA> <NA>      <NA>   0.013   0.008   0.066   0.272   1.102   0.465
5        <NA> <NA>      <NA>   0.013   0.079   0.042   0.113   1.366   4.606
6  ANG MO KIO <NA>      <NA>   0.013   0.020   0.013   0.002   1.797   0.824
7    CLEMENTI <NA>      <NA>   0.003   0.047   0.103   0.007   0.937   1.001
8        <NA> <NA>      <NA>   0.012   0.038   0.045   0.255   1.691   0.739
9        <NA> <NA>      <NA>   0.013   0.124   0.072   0.049   0.891   2.761
10    HOUGANG <NA>      <NA>   0.013   0.008   0.095   0.107   1.587   2.062
   PROX_MR PROX_SC PROX_SU PROX_MA PROX_TO NUM_KIN NUM_CHI NUM_BUS NUM_SCH
1    0.075   0.033   0.074   0.057   0.007       0       1       6       3
2    0.037   0.028   0.232   0.019   0.007       2       4      10       5
3    0.037   0.028   0.232   0.019   0.007       1       5      11       3
4    0.129   0.082   0.433   0.114   0.015       0       4       8       4
5    0.156   0.079   0.317   0.116   0.014       1       4       8       2
6    0.056   0.094   0.040   0.098   0.016       1       3      10       1
7    0.044   0.007   0.159   0.161   0.001       1       3       9       2
8    0.077   0.127   0.330   0.131   0.019       2       8       6       4
9    0.086   0.165   0.502   0.106   0.016       0       2       6       3
10   0.008   0.072   0.252   0.111   0.014       3       6       9       5
   NUM_TOP storey_names                  geometry
1        0            3 POINT (31058.62 34487.27)
2        1            3     POINT (29552 35767.3)
3        1            3 POINT (30022.91 35470.15)
4        0            3 POINT (34103.93 41148.01)
5        0            3 POINT (34180.87 37633.89)
6        0            3    POINT (30229 39731.65)
7        0            3 POINT (20851.34 32875.25)
8        0            3 POINT (21095.91 40990.18)
9        0            3  POINT (29769.97 44954.8)
10       1            3 POINT (35086.31 39648.81)

6.2 Remaining Lease

str_list <- str_split(resale_level_loc_sf$rmnng_l, " ")

for (i in 1:length(str_list)) {
  if (length(unlist(str_list[i])) > 2) {
      year <- as.numeric(unlist(str_list[i])[1])
      month <- as.numeric(unlist(str_list[i])[3])
      resale_level_loc_sf[i, "remaining_lease"] <- (year * 12) + month
  }
  else {
    year <- as.numeric(unlist(str_list[i])[1])
    resale_level_loc_sf[i, "remaining_lease"] <- (year * 12)
  }
}
resale_remaining_lease_loc_sf <- resale_level_loc_sf
st_write(resale_remaining_lease_loc_sf, "data/geospatial/resale_proximity/resale_proximity.shp")
resale_remaining_lease_loc_sf <- st_read(dsn="data/geospatial/resale_proximity", layer="resale_proximity") %>% st_transform(crs = 3414)
Reading layer `resale_proximity' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\resale_proximity' 
  using driver `ESRI Shapefile'
Simple feature collection with 60218 features and 32 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 6958.193 ymin: 28097.64 xmax: 45192.3 ymax: 48741.06
Projected CRS: SVY21 / Singapore TM

6.3 Age of Unit

resale_remaining_lease_loc_sf$year <- zoo::as.Date(zoo::as.yearmon(resale_remaining_lease_loc_sf$ls_cmm_))
resale_remaining_lease_loc_sf$age = round(interval(resale_remaining_lease_loc_sf$year, resale_remaining_lease_loc_sf$date)/ years(1))
resale_age_loc_sf <- resale_remaining_lease_loc_sf

6.4 Upgrades

resale_upgrade_loc_sf <- resale_age_loc_sf
resale_upgrade_loc_sf$HIP <- ifelse(is.na(resale_upgrade_loc_sf$HIP), 0, 1)
resale_upgrade_loc_sf$MUP <- ifelse(is.na(resale_upgrade_loc_sf$MUP), 0, 1)
st_write(resale_remaining_lease_loc_sf, "data/geospatial/resale_upgrade_loc_sf/resale_upgrade_loc_sf.shp")
resale_proximity_sf <- st_read(dsn="data/geospatial/resale_upgrade_loc_sf", layer="resale_upgrade_loc_sf")
Reading layer `resale_upgrade_loc_sf' from data source 
  `C:\deadline2359\IS415-GAA\Take-home_Ex\Take-home_Ex03\data\geospatial\resale_upgrade_loc_sf' 
  using driver `ESRI Shapefile'
Simple feature collection with 60218 features and 32 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 6958.193 ymin: 28097.64 xmax: 45192.3 ymax: 48741.06
Projected CRS: SVY21 / Singapore TM

6.5 HDB Types

hdb_types <- unique(resale_upgrade_loc_sf$flt_typ)
hdb_types
[1] "3 ROOM"           "5 ROOM"           "4 ROOM"           "EXECUTIVE"       
[5] "2 ROOM"           "MULTI-GENERATION" "1 ROOM"          
resale_type_loc_sf <- resale_upgrade_loc_sf %>% filter(resale_upgrade_loc_sf$flt_typ == "3 ROOM")
resale_type_loc_sf
Simple feature collection with 13779 features and 34 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 11597.97 ymin: 28097.64 xmax: 45192.3 ymax: 48622.47
Projected CRS: SVY21 / Singapore TM
First 10 features:
   block         strt_nm   month            town flt_typ  stry_rn flr_r_s
1      1        BEACH RD 2021-02 KALLANG/WHAMPOA  3 ROOM 07 TO 09      68
2      1        BEACH RD 2021-03 KALLANG/WHAMPOA  3 ROOM 07 TO 09      68
3      1        BEACH RD 2021-07 KALLANG/WHAMPOA  3 ROOM 16 TO 18      68
4      1        BEACH RD 2022-12 KALLANG/WHAMPOA  3 ROOM 07 TO 09      68
5      1        BEACH RD 2022-07 KALLANG/WHAMPOA  3 ROOM 01 TO 03      68
6      1        BEACH RD 2021-04 KALLANG/WHAMPOA  3 ROOM 07 TO 09      68
7      1 BEDOK STH AVE 1 2021-01           BEDOK  3 ROOM 07 TO 09      59
8      1 BEDOK STH AVE 1 2021-04           BEDOK  3 ROOM 01 TO 03      59
9      1 BEDOK STH AVE 1 2021-05           BEDOK  3 ROOM 13 TO 15      59
10     1 BEDOK STH AVE 1 2021-09           BEDOK  3 ROOM 10 TO 12      67
    flt_mdl ls_cmm_            rmnng_l rsl_prc       date  HIP TOWN_x  MUP
1  Improved    1979 57 years 08 months  328000 2021-02-01 <NA>   <NA>  MUP
2  Improved    1979 57 years 07 months  350888 2021-03-01 <NA>   <NA>  MUP
3  Improved    1979 57 years 03 months  355000 2021-07-01 <NA>   <NA>  MUP
4  Improved    1979           56 years  418000 2022-12-01 <NA>   <NA>  MUP
5  Improved    1979 56 years 03 months  350000 2022-07-01 <NA>   <NA>  MUP
6  Improved    1979 57 years 05 months  328000 2021-04-01 <NA>   <NA>  MUP
7  Improved    1976  54 years 01 month  268000 2021-01-01 <NA>   <NA> <NA>
8  Improved    1976 53 years 10 months  265000 2021-04-01 <NA>   <NA> <NA>
9  Improved    1976 53 years 08 months  301000 2021-05-01 <NA>   <NA> <NA>
10 Improved    1976 53 years 05 months  322000 2021-09-01 <NA>   <NA> <NA>
            TOWN_y PROX_CB PROX_HA PROX_EL PROX_KI PROX_CH PROX_BU PROX_MR
1  KALLANG/WHAMPOA   0.011   0.105   0.024   0.297   0.510   4.712   0.001
2  KALLANG/WHAMPOA   0.011   0.105   0.024   0.297   0.510   4.712   0.001
3  KALLANG/WHAMPOA   0.011   0.105   0.024   0.297   0.510   4.712   0.001
4  KALLANG/WHAMPOA   0.011   0.105   0.024   0.297   0.510   4.712   0.001
5  KALLANG/WHAMPOA   0.011   0.105   0.024   0.297   0.510   4.712   0.001
6  KALLANG/WHAMPOA   0.011   0.105   0.024   0.297   0.510   4.712   0.001
7             <NA>   0.013   0.029   0.028   0.324   0.727   1.535   0.040
8             <NA>   0.013   0.029   0.028   0.324   0.727   1.535   0.040
9             <NA>   0.013   0.029   0.028   0.324   0.727   1.535   0.040
10            <NA>   0.013   0.029   0.028   0.324   0.727   1.535   0.040
   PROX_SC PROX_SU PROX_MA PROX_TO NUM_KIN NUM_CHI NUM_BUS NUM_SCH NUM_TOP
1    0.017   0.418   0.042   0.006       0       2      15       0       0
2    0.017   0.418   0.042   0.006       0       2      15       0       0
3    0.017   0.418   0.042   0.006       0       2      15       0       0
4    0.017   0.418   0.042   0.006       0       2      15       0       0
5    0.017   0.418   0.042   0.006       0       2      15       0       0
6    0.017   0.418   0.042   0.006       0       2      15       0       0
7    0.047   0.321   0.065   0.011       0       4       5       1       0
8    0.047   0.321   0.065   0.011       0       4       5       1       0
9    0.047   0.321   0.065   0.011       0       4       5       1       0
10   0.047   0.321   0.065   0.011       0       4       5       1       0
                    geometry       year age
1  POINT (30309.27 30830.82) 1979-01-01  42
2  POINT (30309.27 30830.82) 1979-01-01  42
3  POINT (30309.27 30830.82) 1979-01-01  42
4  POINT (30309.27 30830.82) 1979-01-01  44
5  POINT (30309.27 30830.82) 1979-01-01  43
6  POINT (30309.27 30830.82) 1979-01-01  42
7  POINT (39173.81 33678.86) 1976-01-01  45
8  POINT (39173.81 33678.86) 1976-01-01  45
9  POINT (39173.81 33678.86) 1976-01-01  45
10 POINT (39173.81 33678.86) 1976-01-01  46

6.6 save

final_resale_prices <- resale_type_loc_sf
final_resale_prices <- final_resale_prices %>% 
        rename("storey_range" = "stry_rn",
               "street_name" = "strt_nm",
               "flat_type" = "flt_typ",
               "floor_area_sqm" = "flr_r_s",
               "flat_model" = "flt_mdl",
               "lease_commence_date" = "ls_cmm_",
               "remaining_lease_original" = "rmnng_l",
               "resale_price" = "rsl_prc",
               "PROX_CBD" = "PROX_CB",
               "PROX_HAWKER" = "PROX_HA",
               "PROX_ELDERCARE" = "PROX_EL",
               "PROX_KINDERGARTEN" = "PROX_KI",
               "PROX_CHILDCARE" = "PROX_CH",
               "PROX_BUSSTOP" = "PROX_BU",
               "PROX_MRT" = "PROX_MR",
               "PROX_SCH" = "PROX_SC",
               "PROX_TOPSCH" = "PROX_TO",
               "PROX_SUPERMARKET" = "PROX_SU",
               "PROX_MALL" = "PROX_MA",
               "NUM_KINDERGARTEN" = "NUM_KIN",
               "NUM_CHILDCARE" = "NUM_CHI",
               "NUM_BUSSTOP" = "NUM_BUS",
               "NUM_TOPSCH" = "NUM_TOP")
names(final_resale_prices)
 [1] "block"                    "street_name"             
 [3] "month"                    "town"                    
 [5] "flat_type"                "storey_range"            
 [7] "floor_area_sqm"           "flat_model"              
 [9] "lease_commence_date"      "remaining_lease_original"
[11] "resale_price"             "date"                    
[13] "HIP"                      "TOWN_x"                  
[15] "MUP"                      "TOWN_y"                  
[17] "PROX_CBD"                 "PROX_HAWKER"             
[19] "PROX_ELDERCARE"           "PROX_KINDERGARTEN"       
[21] "PROX_CHILDCARE"           "PROX_BUSSTOP"            
[23] "PROX_MRT"                 "PROX_SCH"                
[25] "PROX_SUPERMARKET"         "PROX_MALL"               
[27] "PROX_TOPSCH"              "NUM_KINDERGARTEN"        
[29] "NUM_CHILDCARE"            "NUM_BUSSTOP"             
[31] "NUM_SCH"                  "NUM_TOPSCH"              
[33] "year"                     "age"                     
[35] "geometry"                
write_rds(final_resale_prices, "data/model/final_resale_prices.rds")
final_resale_prices <- read_rds("data/model/final_resale_prices.rds")

7 Exploratory Data Analysis

7.1 Statistical Graphics

7.1.1 Central Business District

tm_shape(cbd_sf) +
  tm_fill("SUBZONE_N") +
  tm_borders(alpha = 0.5) +
  tm_layout(main.title = "Central Business District Map",
            main.title.size = 1.10,
            main.title.position = "center",
            legend.outside = TRUE,
            legend.outside.position = "left")

head(cbd_sf, 5)
Simple feature collection with 5 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 28896.26 ymin: 27914.19 xmax: 30794.28 ymax: 29517.49
Projected CRS: SVY21 / Singapore TM
        SUBZONE_N SUBZONE_C    PLN_AREA_N PLN_AREA_C       REGION_N REGION_C
1           ANSON    DTSZ10 DOWNTOWN CORE         DT CENTRAL REGION       CR
2   TANJONG PAGAR    DTSZ09 DOWNTOWN CORE         DT CENTRAL REGION       CR
3         MAXWELL    DTSZ07 DOWNTOWN CORE         DT CENTRAL REGION       CR
4           CECIL    DTSZ08 DOWNTOWN CORE         DT CENTRAL REGION       CR
5 CENTRAL SUBZONE    DTSZ11 DOWNTOWN CORE         DT CENTRAL REGION       CR
                        geometry
1 MULTIPOLYGON (((29201.07 28...
2 MULTIPOLYGON (((29612.3 286...
3 MULTIPOLYGON (((29341.73 29...
4 MULTIPOLYGON (((29917.52 29...
5 MULTIPOLYGON (((30436.73 29...

7.1.2 Hawker Centres

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(hawkercentre_sf) + 
  tm_dots(size = 0.15,
          shape = 22,
          col = "#4287f5") +
  tm_layout(main.title = "Supermarkets",
            main.title.size = 1.10,
            main.title.position = "center",)

head(hawkercentre_sf, 5)
Simple feature collection with 5 features and 18 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 22042.51 ymin: 29650.7 xmax: 35955.52 ymax: 46139.03
Projected CRS: SVY21 / Singapore TM
                             NAME                DESCRIP
1     Market Street Hawker Centre New Replacement Centre
2    Marsiling Mall Hawker Centre New Replacement Centre
3    Margaret Drive Hawker Centre New Replacement Centre
4 Fernvale Hawker Centre & Market             New Centre
5       One Punggol Hawker Centre             New Centre
                                        ADDRESSBU ADDRESSBL ADDRESSP
1                                    CapitaSpring        50    48940
2                                  Marsiling Mall         4   738623
3                                            <NA>       38A   142038
4 Fernvale Community Club, Hawker Centre & Market        21   797650
5                                     One Punggol         1   828629
              ADDRESSS  LANDXAD  LANDYAD                 STATUS   AWARDED
1        Market Street 29874.82  29650.7 Existing (replacement) 17/7/2017
2  Woodlands Street 12 22042.51 46139.03 Existing (replacement)  4/9/2015
3       Margaret Drive  24816.7 31094.91 Existing (replacement) 11/5/2015
4 Sengkang West Avenue  32867.9 41500.77         Existing (new) 30/7/2018
5        Punggol Drive 35955.52 43336.13         Existing (new)  8/8/2018
    IMPLEME
1  1/8/2017
2  1/1/2016
3 2/12/2015
4 13/8/2018
5 29/8/2018
                                                                                                                                                   INFO_ON
1                                                                                                                           hawker centre/commercial space
2                                                                                                                          hawker centre/supermarket/shops
3                                                                                                      hawker centre/branch office/supermarket/shops/flats
4                                                                                                                      hawker centre/market/community club
5 hawker centre/Regional Library/Senior Care Centre/Branch Office/Child Development Centre/Blood Collection Centre/Kidney Dialysis Centre/Childcare Centre
     EST_ORI            ICON_NA
1 15/11/2021 HC icons_Opt 8.jpg
2  8/11/2017 HC icons_Opt 8.jpg
3  30/3/2022 HC icons_Opt 8.jpg
4  22/3/2022 HC icons_Opt 8.jpg
5  29/4/2022 HC icons_Opt 8.jpg
                                                                       PHOTOUR
1                                                                         <NA>
2 http://www.nea.gov.sg/images/default-source/Hawker-Centres-Division/mmhc.jpg
3                                                                         <NA>
4                                                                         <NA>
5                                                                         <NA>
                                ADDRESS_ APPROXI HUP_COM
1                                   <NA>    <NA>    <NA>
2 Blk 4,Woodlands St 12,Singapore?738623    3661    <NA>
3                                   <NA>    <NA>    <NA>
4                                   <NA>    <NA>    <NA>
5                                   <NA>    <NA>    <NA>
                   geometry
1  POINT (29874.82 29650.7)
2 POINT (22042.51 46139.03)
3  POINT (24816.7 31094.91)
4  POINT (32867.9 41500.77)
5 POINT (35955.52 43336.13)

7.1.3 Eldercare Centres

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(eldercare_sf) + 
  tm_dots(size = 0.15,
          shape = 24,
          col = "#fcba03") +
  tm_layout(main.title = "Eldercare Centres",
            main.title.size = 1.10,
            main.title.position = "center",)

head(eldercare_sf, 5)
Simple feature collection with 5 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 14481.92 ymin: 30135.49 xmax: 38803.81 ymax: 36639.12
Projected CRS: SVY21 / Singapore TM
                                    NAME ADDRESSP
1           Yuhua Senior Activity Centre   601318
2                   THK SAC @ Kaki Bukit   462509
3                     THK SAC @ Boon Lay   640190
4 PEACE-Connect Senior Activity Centre@5   190005
5                 THK SAC @ Beo Crescent   160044
                               ADDRESSS                 ICON_NA
1     318A Jurong East Avenue 1 #02-308 onemap-fc-eldercare.png
2 Blk 509B Bedok North Street 3 #02-157 onemap-fc-eldercare.png
3        Blk 190 Boon Lay Drive #01-242 onemap-fc-eldercare.png
4                   5 Beach Rd #02-4943 onemap-fc-eldercare.png
5            Blk 44 Beo Crescent #01-67 onemap-fc-eldercare.png
                   geometry
1 POINT (16614.08 36639.12)
2 POINT (38803.81 35098.77)
3 POINT (14481.92 36357.61)
4 POINT (31505.35 31853.52)
5 POINT (27218.35 30135.49)

7.1.4 Parks

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(parks_sf) + 
  tm_dots(size = 0.15,
          shape = 23,
          col = "#32a852") +
  tm_layout(main.title = "Parks",
            main.title.size = 1.10,
            main.title.position = "center")

head(eldercare_sf, 5)
Simple feature collection with 5 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 14481.92 ymin: 30135.49 xmax: 38803.81 ymax: 36639.12
Projected CRS: SVY21 / Singapore TM
                                    NAME ADDRESSP
1           Yuhua Senior Activity Centre   601318
2                   THK SAC @ Kaki Bukit   462509
3                     THK SAC @ Boon Lay   640190
4 PEACE-Connect Senior Activity Centre@5   190005
5                 THK SAC @ Beo Crescent   160044
                               ADDRESSS                 ICON_NA
1     318A Jurong East Avenue 1 #02-308 onemap-fc-eldercare.png
2 Blk 509B Bedok North Street 3 #02-157 onemap-fc-eldercare.png
3        Blk 190 Boon Lay Drive #01-242 onemap-fc-eldercare.png
4                   5 Beach Rd #02-4943 onemap-fc-eldercare.png
5            Blk 44 Beo Crescent #01-67 onemap-fc-eldercare.png
                   geometry
1 POINT (16614.08 36639.12)
2 POINT (38803.81 35098.77)
3 POINT (14481.92 36357.61)
4 POINT (31505.35 31853.52)
5 POINT (27218.35 30135.49)

7.1.5 Kindergartens

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(kindergartens_sf) + 
  tm_dots(size = 0.15,
          shape = 21,
          col = "#DE60B5") +
  tm_layout(main.title = "Kindergartens",
            main.title.size = 1.10,
            main.title.position = "center")

head(kindergartens_sf, 5)
Simple feature collection with 5 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 27354.73 ymin: 38859.25 xmax: 30705.05 ymax: 46300.26
Projected CRS: SVY21 / Singapore TM
                                                        NAME       DESCRIP
1 PCF Sparkletots Preschool @ Cheng San-Seletar Blk 435 (KN) Kindergartens
2 PCF Sparkletots Preschool @ Cheng San-Seletar Blk 533 (KN) Kindergartens
3 PCF Sparkletots Preschool @ Cheng San-Seletar Blk 556 (DS) Kindergartens
4        PCF Sparkletots Preschool @ Chong Pang Blk 107 (KN) Kindergartens
5        PCF Sparkletots Preschool @ Chong Pang Blk 122 (KN) Kindergartens
  ADDRESSP                                     ADDRESSS    ICON_NA
1   560435 435 Ang Mo Kio Avenue 10  #01-1393 S(560435) school.gif
2   560533  533 Ang Mo Kio Avenue 5  #01-4100 S(560533) school.gif
3   560556 556 Ang Mo Kio Avenue 10  #01-1902 S(560556) school.gif
4   760107      107 Yishun Ring Road  #01-207 S(760107) school.gif
5   760122      122 Yishun Street 11  #01-479 S(760122) school.gif
                   geometry
1 POINT (30325.45 38859.25)
2 POINT (30190.51 39574.18)
3  POINT (30705.05 39337.9)
4 POINT (27354.73 45992.92)
5 POINT (27755.87 46300.26)

7.1.6 Childcare Centres

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(childcare_sf) + 
  tm_dots(size = 0.15,
          shape = 25,
          col = "#AB66C6") +
  tm_layout(main.title = "Childcare Centres",
            main.title.size = 1.10,
            main.title.position = "center")

head(childcare_sf, 5)
Simple feature collection with 5 features and 5 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 17828.84 ymin: 29221.89 xmax: 40985.94 ymax: 45530.47
Projected CRS: SVY21 / Singapore TM
                                            NAME             DESCRIP ADDRESSP
1 APOLLO INTERNATIONAL PRESCHOOL PRIVATE LIMITED Child Care Services   467903
2                   APPLE TREE PLAYHOUSE PTE LTD Child Care Services   768019
3 Appleland Montessori Child Care Centre Pte Ltd Child Care Services   650165
4                            APPLELAND PLAYHOUSE Child Care Services   103104
5             APRICOT ACADEMY (LAGUNA) PTE. LTD. Child Care Services   449290
                                                               ADDRESSS
1                        44, LIMAU GARDEN, BEDOK PARK, SINGAPORE 467903
2     1, NORTHPOINT DRIVE, #02 - 201, NORTHPOINT CITY, SINGAPORE 768019
3           165, BUKIT BATOK WEST AVENUE 8, #01 - 286, SINGAPORE 650165
4                          104C, DEPOT ROAD, #01 - 03, SINGAPORE 103104
5 5000G, MARINE PARADE ROAD, #01 - 28/30, LAGUNA PARK, SINGAPORE 449290
                  ICON_NA                  geometry
1 onemap-fc-childcare.png POINT (40985.94 33848.38)
2 onemap-fc-childcare.png POINT (28308.65 45530.47)
3 onemap-fc-childcare.png POINT (17828.84 36607.36)
4 onemap-fc-childcare.png POINT (25579.73 29221.89)
5 onemap-fc-childcare.png POINT (38981.02 32483.41)

7.1.7 Bus Stops

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(busstop_sf) + 
  tm_dots(size = 0.15,
          shape = 21,
          col = "#CABE3E") +
  tm_layout(main.title = "Bus Stops",
            main.title.size = 1.10,
            main.title.position = "center")

head(busstop_sf, 5)
Simple feature collection with 5 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 13228.59 ymin: 30391.85 xmax: 41603.76 ymax: 44206.38
Projected CRS: SVY21 / Singapore TM
  BUS_STOP_N BUS_ROOF_N             LOC_DESC                  geometry
1      22069        B06   OPP CEVA LOGISTICS POINT (13576.31 32883.65)
2      32071        B23         AFT TRACK 13 POINT (13228.59 44206.38)
3      44331        B01              BLK 239  POINT (21045.1 40242.08)
4      96081        B05 GRACE INDEPENDENT CH POINT (41603.76 35413.11)
5      11561        B05              BLK 166 POINT (24568.74 30391.85)

7.1.8 MRT Stations

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(mrt_sf) + 
  tm_dots(size = 0.15,
          shape = 22,
          col = "#FE9C1D") +
  tm_layout(main.title = "MRT Stations",
            main.title.size = 1.10,
            main.title.position = "center")

head(mrt_sf, 5)
Simple feature collection with 5 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 29286.74 ymin: 30548.91 xmax: 34623.54 ymax: 33404.47
Projected CRS: SVY21 / Singapore TM
  TYP_CD STN_NAM TYP_CD_DES              STN_NAM_DE
1      0    <NA>        MRT   ESPLANADE MRT STATION
2      0    <NA>        MRT  PAYA LEBAR MRT STATION
3      0    <NA>        MRT DHOBY GHAUT MRT STATION
4      0    <NA>        MRT      DAKOTA MRT STATION
5      0    <NA>        MRT    LAVENDER MRT STATION
                        geometry
1 POLYGON ((30566.07 30621.21...
2 POLYGON ((34495.6 33384.44,...
3 POLYGON ((29293.51 31312.53...
4 POLYGON ((34055.08 32290.62...
5 POLYGON ((31236.5 32085.76,...

7.1.9 Primary Schools

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(primarySch_ranking_sf) + 
  tm_dots(size = 0.15,
          shape = 23,
          col = "#BFBD9C") +
  tm_layout(main.title = "Primary Schools",
            main.title.size = 1.10,
            main.title.position = "center")

head(primarySch_ranking_sf, 5)
Simple feature collection with 5 features and 16 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 19496.7 ymin: 28588.52 xmax: 28735.3 ymax: 32798.72
Projected CRS: SVY21 / Singapore TM
  postalCode rank                  schoolName schoolZone
1     088256  182   Cantonment Primary School      South
2     099757  101              Chij (Kellock)      South
3     099840   27    Radin Mas Primary School      South
4     109100  168 Blangah Rise Primary School      South
5     128104   33         Qifa Primary School       West
                   address  schoolArea schoolStatus         affiliatedSchools
1       1 Cantonment Close Bukit Merah                                       
2      1 Bukit Teresa Road Bukit Merah              chij-st-theresa-s-convent
3    1 Bukit Purmei Avenue Bukit Merah                                       
4 91 Telok Blangah Heights Bukit Merah                                       
5     50 West Coast Avenue    Clementi                                       
  admissionRank sportsRank artsRank uniformedGroupsRank
1           132        153      163                 153
2           104         29       82                 163
3            10         74       83                  13
4           132        167      173                  61
5            61         26       10                 106
                   SCHOOLNAME SCH_HSE_BLK_NUM HSE_BLK_NUM         SCH_ROAD_NAME
1   CANTONMENT PRIMARY SCHOOL               1           1      CANTONMENT CLOSE
2              CHIJ (KELLOCK)               1           1     BUKIT TERESA ROAD
3    RADIN MAS PRIMARY SCHOOL               1           1   BUKIT PURMEI AVENUE
4 BLANGAH RISE PRIMARY SCHOOL              91          91 TELOK BLANGAH HEIGHTS
5         QIFA PRIMARY SCHOOL              50          50     WEST COAST AVENUE
                   geometry
1  POINT (28735.3 28695.24)
2 POINT (27413.63 28588.52)
3 POINT (26951.05 28619.79)
4 POINT (25215.83 28728.96)
5  POINT (19496.7 32798.72)

7.1.10 Supermarkets

tm_shape(subzone_sf) +
  tm_fill() +
  tm_polygons(alpha = 0.5) +
  tm_borders(alpha = 0.5) +
tm_shape(supermarkets_sf) + 
  tm_dots(size = 0.15,
          shape = 24,
          col = "#37F7C5") +
  tm_layout(main.title = "Supermarkets",
            main.title.size = 1.10,
            main.title.position = "center")

head(supermarkets_sf, 5)
Simple feature collection with 5 features and 8 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 32184.01 ymin: 32947.46 xmax: 41320.3 ymax: 42685.17
Projected CRS: SVY21 / Singapore TM
                                     LIC_NAME BLK_HOUSE             STR_NAME
1 LI LI CHENG SUPERMARKET (PUNGGOL) PTE. LTD.      273C        PUNGGOL PLACE
2             SHENG SIONG SUPERMARKET PTE LTD        11 UPPER BOON KENG ROAD
3       COLD STORAGE SINGAPORE (1983) PTE LTD       683     HOUGANG AVENUE 8
4       COLD STORAGE SINGAPORE (1983) PTE LTD       631 BEDOK RESERVOIR ROAD
5                     YES SUPERMARKET PTE LTD      201B   TAMPINES STREET 21
  UNIT_NO POSTCODE      LIC_NO          INC_CRC FMEL_UPD_D
1     884   823273 NE12I65N000 3DE8AF6E76F9D3D4 2017-11-29
2     901   380011  E73010V000 F361759A8261CD6E 2017-11-29
3     903   530683 NE11909C000 1DC69902E02077CE 2017-11-29
4     954   470631  S02210X000 4E2560154B58BA38 2017-11-29
5    1091   522201  S02037J000 559A9A00D9FF8A55 2017-11-29
                   geometry
1 POINT (35561.22 42685.17)
2 POINT (32184.01 32947.46)
3 POINT (33903.48 39480.46)
4 POINT (37083.82 35017.47)
5  POINT (41320.3 37283.82)

8 OLS Multiple Linear Regression Model

8.1 Filtering

8.1.1 Train

train_resale_prices <- final_resale_prices %>% filter(between(date, as.Date('2022-06-01'), as.Date('2022-12-01')))
train_resale_prices <- train_resale_prices[, c(7, 11, 13, 15, 17:34, 36)]

8.1.2 Test

test_resale_prices <- final_resale_prices %>% filter(between(date, as.Date('2023-01-01'), as.Date('2023-02-01')))
test_resale_prices <- test_resale_prices[, c(7, 11, 13, 15, 17:34, 36)]
total_resale_prices <- final_resale_prices[, c(7, 11, 13, 15, 17:34, 36)]
st_geometry(total_resale_prices) <- NULL

8.2 Visualising the relationships of the independent variables

corrplot(cor(total_resale_prices), diag = FALSE, order = "AOE",
         tl.pos = "td", tl.cex = 0.5, method = "number", type = "upper")

Looks like age of the HDB flats and remaining lease are practically inversely correlated, which makes sense. So one will be removed, age will be removed.

set.seed(1234)
resale_price.mlr <- lm(formula = resale_price ~ floor_area_sqm + HIP + MUP + PROX_CBD + PROX_HAWKER +  PROX_ELDERCARE + PROX_KINDERGARTEN + PROX_CHILDCARE + PROX_BUSSTOP + PROX_MRT +  PROX_SUPERMARKET + PROX_MALL + PROX_TOPSCH + NUM_KINDERGARTEN + NUM_CHILDCARE + NUM_BUSSTOP + NUM_TOPSCH, data = train_resale_prices)
ols_regress(resale_price.mlr)
set.seed(1234)
resale_price.mlr <- lm(formula = resale_price ~ floor_area_sqm + HIP + MUP + PROX_CBD + PROX_HAWKER +  PROX_ELDERCARE + PROX_CHILDCARE + PROX_SUPERMARKET + PROX_MALL + PROX_TOPSCH + NUM_KINDERGARTEN + NUM_CHILDCARE  + NUM_SCH + NUM_TOPSCH, data = train_resale_prices)
ols_regress(resale_price.mlr)
write_rds(resale_price.mlr, "data/model/resale_price.mlr.rds")
resale_price.mlr <- read_rds("data/model/resale_price.mlr.rds")
summary(resale_price.mlr)

Call:
lm(formula = resale_price ~ floor_area_sqm + HIP + MUP + PROX_CBD + 
    PROX_HAWKER + PROX_ELDERCARE + PROX_CHILDCARE + PROX_SUPERMARKET + 
    PROX_MALL + PROX_TOPSCH + NUM_KINDERGARTEN + NUM_CHILDCARE + 
    NUM_SCH + NUM_TOPSCH, data = train_resale_prices)

Residuals:
    Min      1Q  Median      3Q     Max 
-209212  -43894   -8586   26338  378127 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)        217932.0    15906.7  13.701  < 2e-16 ***
floor_area_sqm       4379.0      216.3  20.248  < 2e-16 ***
HIP                -49887.9     3054.0 -16.335  < 2e-16 ***
MUP                -72399.1     3552.9 -20.377  < 2e-16 ***
PROX_CBD         -1111345.4   464953.2  -2.390 0.016887 *  
PROX_HAWKER       -198233.7    31913.3  -6.212 5.82e-10 ***
PROX_ELDERCARE     -60342.5    34597.5  -1.744 0.081219 .  
PROX_CHILDCARE      -3350.9     2547.0  -1.316 0.188386    
PROX_SUPERMARKET   -50430.4     8520.0  -5.919 3.53e-09 ***
PROX_MALL         -182216.1    36952.6  -4.931 8.53e-07 ***
PROX_TOPSCH      -3131807.8   312900.9 -10.009  < 2e-16 ***
NUM_KINDERGARTEN    -5929.8     1667.8  -3.556 0.000382 ***
NUM_CHILDCARE        2194.7      764.8   2.870 0.004134 ** 
NUM_SCH              -766.1     1034.0  -0.741 0.458796    
NUM_TOPSCH          10097.1     2658.6   3.798 0.000148 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 74130 on 3744 degrees of freedom
Multiple R-squared:  0.2737,    Adjusted R-squared:  0.271 
F-statistic: 100.8 on 14 and 3744 DF,  p-value: < 2.2e-16

8.3 Preparing Publication Quality Table: gtsummary method

tbl_regression(resale_price.mlr, intercept = TRUE)
Characteristic Beta 95% CI1 p-value
(Intercept) 217,932 186,745, 249,119 <0.001
floor_area_sqm 4,379 3,955, 4,803 <0.001
HIP -49,888 -55,876, -43,900 <0.001
MUP -72,399 -79,365, -65,433 <0.001
PROX_CBD -1,111,345 -2,022,932, -199,759 0.017
PROX_HAWKER -198,234 -260,803, -135,665 <0.001
PROX_ELDERCARE -60,342 -128,174, 7,489 0.081
PROX_CHILDCARE -3,351 -8,345, 1,643 0.2
PROX_SUPERMARKET -50,430 -67,135, -33,726 <0.001
PROX_MALL -182,216 -254,665, -109,767 <0.001
PROX_TOPSCH -3,131,808 -3,745,281, -2,518,335 <0.001
NUM_KINDERGARTEN -5,930 -9,200, -2,660 <0.001
NUM_CHILDCARE 2,195 695, 3,694 0.004
NUM_SCH -766 -2,793, 1,261 0.5
NUM_TOPSCH 10,097 4,885, 15,310 <0.001
1 CI = Confidence Interval
tbl_regression(resale_price.mlr, 
               intercept = TRUE) %>% 
  add_glance_source_note(
    label = list(sigma ~ "\U03C3"),
    include = c(r.squared, adj.r.squared, 
                AIC, statistic,
                p.value, sigma))
Characteristic Beta 95% CI1 p-value
(Intercept) 217,932 186,745, 249,119 <0.001
floor_area_sqm 4,379 3,955, 4,803 <0.001
HIP -49,888 -55,876, -43,900 <0.001
MUP -72,399 -79,365, -65,433 <0.001
PROX_CBD -1,111,345 -2,022,932, -199,759 0.017
PROX_HAWKER -198,234 -260,803, -135,665 <0.001
PROX_ELDERCARE -60,342 -128,174, 7,489 0.081
PROX_CHILDCARE -3,351 -8,345, 1,643 0.2
PROX_SUPERMARKET -50,430 -67,135, -33,726 <0.001
PROX_MALL -182,216 -254,665, -109,767 <0.001
PROX_TOPSCH -3,131,808 -3,745,281, -2,518,335 <0.001
NUM_KINDERGARTEN -5,930 -9,200, -2,660 <0.001
NUM_CHILDCARE 2,195 695, 3,694 0.004
NUM_SCH -766 -2,793, 1,261 0.5
NUM_TOPSCH 10,097 4,885, 15,310 <0.001
R² = 0.274; Adjusted R² = 0.271; AIC = 94,988; Statistic = 101; p-value = <0.001; σ = 74,129
1 CI = Confidence Interval

8.4 Checking for multicolinearity

ols_vif_tol(resale_price.mlr)
          Variables Tolerance      VIF
1    floor_area_sqm 0.9431674 1.060257
2               HIP 0.8346136 1.198159
3               MUP 0.8375694 1.193931
4          PROX_CBD 0.4847280 2.063013
5       PROX_HAWKER 0.8941397 1.118393
6    PROX_ELDERCARE 0.9211740 1.085571
7    PROX_CHILDCARE 0.9362782 1.068059
8  PROX_SUPERMARKET 0.9730071 1.027742
9         PROX_MALL 0.5920959 1.688916
10      PROX_TOPSCH 0.4740046 2.109684
11 NUM_KINDERGARTEN 0.6802116 1.470131
12    NUM_CHILDCARE 0.6707035 1.490972
13          NUM_SCH 0.7610742 1.313932
14       NUM_TOPSCH 0.8840727 1.131129
ols_plot_resid_fit(resale_price.mlr)

ols_plot_resid_hist(resale_price.mlr)

8.5 Testing for Spatial Autocorrelation

resale_price.mlr.output <- as.data.frame(resale_price.mlr$residuals)
final_resale_price.mlr.sf <- cbind(train_resale_prices, resale_price.mlr.output) %>%
  rename(`MLR_RES` = `resale_price.mlr.residuals`)
final_resale_price.mlr.sp <- as_Spatial(final_resale_price.mlr.sf)
final_resale_price.mlr.sp
class       : SpatialPointsDataFrame 
features    : 3759 
extent      : 11597.97, 45154.27, 28097.64, 48622.47  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 24
names       : floor_area_sqm, resale_price, HIP, MUP, PROX_CBD, PROX_HAWKER, PROX_ELDERCARE, PROX_KINDERGARTEN, PROX_CHILDCARE, PROX_BUSSTOP, PROX_MRT, PROX_SCH, PROX_SUPERMARKET, PROX_MALL, PROX_TOPSCH, ... 
min values  :             51,       240000,   0,   0,    0.001,       0.002,          0.001,             0.001,          0.002,        0.005,    0.001,    0.001,            0.001,     0.003,       0.001, ... 
max values  :            137,       958000,   1,   1,    0.013,       0.125,          0.133,             0.438,          1.912,        5.138,    0.217,    0.181,            0.526,     0.167,        0.02, ... 

8.5.1 Map

tmap_mode("view")
tm_shape(subzone_sf)+
  tmap_options(check.and.fix = TRUE) +
  tm_polygons(alpha = 0.4) +
tm_shape(final_resale_price.mlr.sf) +  
  tm_dots(col = "MLR_RES",
          alpha = 0.6,
          style="quantile") +
  tm_view(set.zoom.limits = c(11,14))
tmap_mode("plot")
nb <- dnearneigh(coordinates(final_resale_price.mlr.sp), 0, 1500, longlat = FALSE)
nb_lw <- nb2listw(nb, style = 'W')
lm.morantest(resale_price.mlr, nb_lw)
mlr_pred <- predict.lm(resale_price.mlr, test_resale_prices)
write_rds(mlr_pred, "data/model/mlr_pred.rds")
mlr_pred <- read_rds("data/model/mlr_pred.rds")
mlr_pred_df <- as.data.frame(mlr_pred)
test_data_p_mlr <- cbind(test_resale_prices, mlr_pred_df)
Metrics::rmse(test_data_p_mlr$resale_price, 
     test_data_p_mlr$mlr_pred)
[1] 75888.6
OLS_scatterplot <- ggplot(data = test_data_p_mlr,
       aes(x = mlr_pred,
           y = resale_price)) +
  geom_point()

OLS_scatterplot

9 Building Hedonic Pricing Models using GWmodel

total_resale_prices <- final_resale_prices[, c(7, 11, 13, 15, 17:34, 36)]
coords <- st_coordinates(total_resale_prices)
coords_train <- st_coordinates(train_resale_prices)
coords_test <- st_coordinates(test_resale_prices)

9.1 Building Fixed Bandwidth GWR Model

9.1.1 Computing fixed bandwith

resale_price.sp <- as_Spatial(train_resale_prices)
set.seed(1234)
bw_adaptive <- bw.gwr(resale_price ~ floor_area_sqm + HIP + MUP + PROX_CBD + PROX_HAWKER +  PROX_ELDERCARE + PROX_KINDERGARTEN + PROX_CHILDCARE + PROX_BUSSTOP + PROX_MRT +  PROX_SUPERMARKET + PROX_MALL + PROX_TOPSCH + NUM_KINDERGARTEN + NUM_CHILDCARE + NUM_BUSSTOP + NUM_TOPSCH, 
                      data=resale_price.sp, 
                      approach="CV", 
                      kernel="gaussian", 
                      adaptive=TRUE, 
                      longlat=FALSE)

As you can see from above, the most suitable bandwidth is 208.

train_resale_prices <- train_resale_prices %>% 
  st_drop_geometry()
train_resale_prices <- train_resale_prices %>% 
  st_drop_geometry()
set.seed(1234)
rf <- ranger(resale_price ~ floor_area_sqm + HIP + MUP + PROX_CBD + PROX_HAWKER +  PROX_ELDERCARE + PROX_KINDERGARTEN + PROX_CHILDCARE + PROX_BUSSTOP + PROX_MRT +  PROX_SUPERMARKET + PROX_MALL + PROX_TOPSCH + NUM_KINDERGARTEN + NUM_CHILDCARE + NUM_BUSSTOP + NUM_TOPSCH,
             data=train_resale_prices)
print(rf)
set.seed(1234)
grf_adaptive <- grf(formula = resale_price ~ floor_area_sqm + HIP + MUP + PROX_CBD + PROX_HAWKER +  PROX_ELDERCARE + PROX_KINDERGARTEN + PROX_CHILDCARE + PROX_SUPERMARKET + PROX_MALL + PROX_TOPSCH + NUM_KINDERGARTEN + NUM_CHILDCARE + NUM_BUSSTOP + NUM_SCH + NUM_TOPSCH, 
                     dframe=train_resale_prices,
                     bw=bw_adaptive,
                     kernel="adaptive",
                     coords=coords_train,
                      ntree = 30)
write_rds(grf_adaptive, "data/model/grf_adaptive.rds")
grf_adaptive <- read_rds("data/model/grf_adaptive.rds")
test_data <- cbind(test_resale_prices, coords_test) %>%
  st_drop_geometry()
gwRF_pred <- predict.grf(grf_adaptive, 
                           test_data, 
                           x.var.name="X",
                           y.var.name="Y", 
                           local.w=1,
                           global.w=0)
gwRF_pred <- write_rds(gwRF_pred, "data/model/gwRF_pred.rds")
gwRF_pred <- read_rds("data/model/gwRF_pred.rds")
gwRF_pred_df <- as.data.frame(gwRF_pred)
test_data_p <- cbind(test_data, gwRF_pred)
write_rds(test_data_p, "data/model/test_data_p.rds")
test_data_p <- read_rds("data/model/test_data_p.rds")
test_data_p <- as.data.frame(test_data_p)
Metrics::rmse(test_data_p$resale_price, 
     test_data_p$gwRF_pred)
[1] 33323.73
ggplot(data = test_data_p,
       aes(x = gwRF_pred,
           y = resale_price)) +
  geom_point()